# 4-bit Quantization
## Llama 3.3 70B Instruct 4bit DWQ
4-bit DWQ quantized version of the Llama 3.3 70B instruction-tuned model, optimized for efficient inference on the MLX framework.
- Tags: Large Language Model · Supports Multiple Languages
- Author: mlx-community · Downloads: 140 · Likes: 2
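The cards in this list all describe 4-bit weight quantization: replacing each full-precision weight with one of 16 levels plus per-group scale parameters, which shrinks memory roughly 4x versus FP16. The sketch below is a minimal group-wise asymmetric quantizer in NumPy, illustrating the general idea only; the DWQ and AWQ methods named above additionally use calibration data (and, for DWQ, distillation) to choose scales, and real 4-bit formats pack two weights per byte rather than storing one per `uint8` as done here for clarity.

```python
import numpy as np

def quantize_4bit(weights, group_size=32):
    """Group-wise asymmetric 4-bit quantization (illustrative sketch,
    not the actual DWQ/AWQ algorithm). Returns codes in [0, 15] plus
    a per-group scale and minimum for dequantization."""
    w = weights.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0              # 16 levels: codes 0..15
    scale = np.where(scale == 0, 1.0, scale)    # guard all-constant groups
    q = np.clip(np.round((w - w_min) / scale), 0, 15).astype(np.uint8)
    return q, scale, w_min

def dequantize_4bit(q, scale, w_min, shape):
    """Map 4-bit codes back to approximate float weights."""
    return (q.astype(np.float32) * scale + w_min).reshape(shape)

# Round-trip a random weight matrix; reconstruction error is bounded
# by half a quantization step (scale / 2) in each group.
np.random.seed(0)
w = np.random.randn(64, 32).astype(np.float32)
q, s, m = quantize_4bit(w)
w_hat = dequantize_4bit(q, s, m, w.shape)
err = np.abs(w - w_hat).max()
```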
## Gemma 3 27b It 4bit DWQ
A 4-bit quantized version converted from the Google Gemma 3 27B IT model, specifically optimized for the MLX framework.
- Tags: Large Language Model
- Author: mlx-community · Downloads: 102 · Likes: 1
## Qwen3 14B 4bit AWQ
Qwen3-14B-4bit-AWQ is an MLX-format model converted from Qwen/Qwen3-14B, using AWQ quantization to compress the model to 4 bits for efficient inference on the MLX framework.
- License: Apache-2.0
- Tags: Large Language Model
- Author: mlx-community · Downloads: 252 · Likes: 2
## 3b De Ft Research Release 4bit
A German text-to-speech model converted to MLX format, supporting German speech synthesis tasks.
- License: Apache-2.0
- Tags: Speech Synthesis · Transformers · German
- Author: mlx-community · Downloads: 19 · Likes: 0
## Deepseek R1 Distill Qwen 32B 4bit
The MLX 4-bit quantized version of the DeepSeek-R1-Distill-Qwen-32B model, designed for efficient inference on Apple silicon devices.
- Tags: Large Language Model · Transformers
- Author: mlx-community · Downloads: 130.79k · Likes: 40